December 5, 2016
Source: College Scorecard, a federally maintained dataset
Outcome Statistics by Cohort:
Explanatory Statistics for each school:
Are "outcome gaps" between groups related to the type of school or expenditures?
Compute gap metrics as percent difference in outcomes
Conduct hypothesis testing
Are other variables related to outcome gaps?
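As a minimal sketch of the gap metric named above: the slides say only that gaps are "percent difference in outcomes", so the formula here (difference divided by the reference group's outcome) is an assumption. The function name `percent_gap` and the rates are hypothetical and for illustration only.

```python
def percent_gap(reference: float, comparison: float) -> float:
    """Return the gap between two groups as a percentage of the reference outcome.

    Assumed definition: 100 * (reference - comparison) / reference.
    A negative result means the comparison group outperforms the reference group.
    """
    if reference == 0:
        raise ValueError("reference outcome must be non-zero")
    return 100.0 * (reference - comparison) / reference


# Hypothetical completion rates for a single school (not taken from the dataset).
high_income_rate = 0.80
low_income_rate = 0.60
gap = percent_gap(high_income_rate, low_income_rate)
print(f"high-low completion gap: {gap:.1f}%")
```

Under this convention a positive gap favors the reference group and a negative gap favors the comparison group.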
Trend looks generally negative, but with a lot of variance
CSUs have significant completion rate gaps, but UCs and private elites have similar outcome gaps despite differences in expenditures.
Interestingly, the gap between middle income and high income students seems to worsen with higher expenditures.
Negative values indicate that low income students actually tend to outperform middle income students
The data looks fairly random. Possibly a negative trend, but fairly weak-looking.
Similar patterns. It does seem that at higher expenditures, there is less variance in completion gaps.
There definitely seem to be differences in means between the three groups, but the distributions are similar.
This correlation for high-low income earnings gaps looks much stronger.
Correlation with high-middle income gaps is less dramatic but still seems to have some negative trend.
This pattern seems fairly flat, maybe a slight positive trend.
Little evidence that the white-black completion rate gap differs significantly between types of schools.
Here we see evidence that for-profit schools have significantly different white-hispanic completion gaps.
We see a similar pattern with the white-Asian completion gap difference.
Differences look more significant here.
Less significant in comparison to High-Low.
Differences are less significant. We do see the trends from the past two plots reverse.
Hypothesis: coefficients regressing outcome gaps on expenditures will be negative
Most slopes are not significantly different from zero, and many are positive. Hypothesis rejected.
Extremely significant negative coefficient for high-low, others less significant.
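The slope test described above can be sketched as a simple least-squares fit of a gap metric on expenditures. This is an illustration under stated assumptions, not the analysis actually run: the function name `slope_t_statistic` is hypothetical, the data are synthetic, and only the t-statistic is computed (it would be compared against a t distribution with n - 2 degrees of freedom to get a p-value).

```python
import math


def slope_t_statistic(x, y):
    """Fit y = intercept + slope * x by ordinary least squares.

    Returns (slope, standard_error, t_statistic) for the slope.
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least three paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x has no variation")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares, then the usual standard error of the slope.
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    std_err = math.sqrt(rss / (n - 2) / sxx)
    if std_err == 0:
        raise ValueError("perfect fit: t-statistic is undefined")
    return slope, std_err, slope / std_err


# Synthetic example: expenditures (arbitrary units) vs. an earnings gap (percent).
expenditures = [1, 2, 3, 4, 5]
earnings_gap = [10, 8, 7, 5, 3]
slope, std_err, t_stat = slope_t_statistic(expenditures, earnings_gap)
print(f"slope={slope:.2f}, se={std_err:.2f}, t={t_stat:.1f}")
```

A strongly negative t-statistic would be consistent with the hypothesis that gaps shrink as expenditures rise; a value near zero or a positive one would not.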
Consider relationships and relative importance of other possibly relevant variables, relating to:
INEXPFTE is quite important compared to other terms. Diversity also seems relevant.
INEXPFTE is important again. Economic variables are more important in these models.
Completion rate differences across races did not seem to correspond strongly with type of school or expenditures
Significant evidence that earnings gaps across economic backgrounds correspond with expenditures
No accounting for multiple testing
Of many possible variables, Random Forest analysis suggests INEXPFTE is more important than others, but CONTROL is unimportant
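On the multiple-testing caveat above: the analysis applies no correction. Purely as an illustration of what one could look like, here is a sketch of the Holm step-down adjustment. The choice of Holm is an assumption (it is not something the slides use), the function name `holm_adjust` is hypothetical, and the p-values are made up.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted p-values, in the original order."""
    m = len(p_values)
    # Visit p-values from smallest to largest, remembering original positions.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is scaled by (m - k + 1).
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce that adjusted values never decrease along the sorted order.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Made-up p-values standing in for several gap-vs-expenditure tests.
raw = [0.01, 0.04, 0.03]
print(holm_adjust(raw))
```

Each adjusted value can then be compared with the usual significance threshold in place of the raw p-value.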